Overview

Dataset statistics

Number of variables: 9
Number of observations: 706
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 49.8 KiB
Average record size in memory: 72.2 B

Variable types

NUM: 8
DATE: 1

Warnings

High is highly correlated with Open and 3 other fields [High correlation]
Open is highly correlated with High and 3 other fields [High correlation]
Low is highly correlated with Open and 3 other fields [High correlation]
Close is highly correlated with Open and 3 other fields [High correlation]
Adj Close is highly correlated with Open and 3 other fields [High correlation]
df_index has unique values [Unique]
Date has unique values [Unique]
Variation has 10 (1.4%) zeros [Zeros]

Reproduction

Analysis started: 2020-11-12 07:07:07.531129
Analysis finished: 2020-11-12 07:07:29.407701
Duration: 21.88 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 706
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 354.3031161
Minimum: 0
Maximum: 708
Zeros: 1
Zeros (%): 0.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 36.25
Q1: 177.25
Median: 354.5
Q3: 531.5
95-th percentile: 672.75
Maximum: 708
Range: 708
Interquartile range (IQR): 354.25

Descriptive statistics

Standard deviation: 204.7611829
Coefficient of variation (CV): 0.5779265651
Kurtosis: -1.200270634
Mean: 354.3031161
Median Absolute Deviation (MAD): 177.5
Skewness: 0.0003630745736
Sum: 250138
Variance: 41927.14203
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
708                 1      0.1%
177                 1      0.1%
241                 1      0.1%
240                 1      0.1%
239                 1      0.1%
238                 1      0.1%
237                 1      0.1%
236                 1      0.1%
235                 1      0.1%
234                 1      0.1%
Other values (696)  696    98.6%

Minimum 5 values
Value  Count  Frequency (%)
0      1      0.1%
1      1      0.1%
2      1      0.1%
3      1      0.1%
4      1      0.1%

Maximum 5 values
Value  Count  Frequency (%)
708    1      0.1%
707    1      0.1%
706    1      0.1%
705    1      0.1%
704    1      0.1%

Date
Date

UNIQUE

Distinct: 706
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 5.5 KiB
Minimum: 2018-01-02 00:00:00
Maximum: 2020-11-10 00:00:00
Histogram with fixed size bins (bins=50)

Open
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 542
Distinct (%): 76.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.4308357
Minimum: 11.07
Maximum: 30.889999
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 11.07
5-th percentile: 16.10999975
Q1: 19.9550005
Median: 23.285001
Q3: 27.02
95-th percentile: 29.860001
Maximum: 30.889999
Range: 19.819999
Interquartile range (IQR): 7.0649995

Descriptive statistics

Standard deviation: 4.358424213
Coefficient of variation (CV): 0.1860123245
Kurtosis: -0.7895191096
Mean: 23.4308357
Median Absolute Deviation (MAD): 3.575
Skewness: -0.277854466
Sum: 16542.17001
Variance: 18.99586162
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
19.5                5      0.7%
27.549999           5      0.7%
22.540001           5      0.7%
26.799999           4      0.6%
26.17               4      0.6%
21.4                4      0.6%
26.299999           4      0.6%
26.35               4      0.6%
27.35               4      0.6%
25.5                4      0.6%
Other values (532)  663    93.9%

Minimum 5 values
Value  Count  Frequency (%)
11.07  1      0.1%
11.79  1      0.1%
12.11  1      0.1%
12.57  1      0.1%
12.91  1      0.1%

Maximum 5 values
Value      Count  Frequency (%)
30.889999  2      0.3%
30.879999  1      0.1%
30.82      1      0.1%
30.799999  1      0.1%
30.690001  2      0.3%

High
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 539
Distinct (%): 76.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.76616144
Minimum: 12.18
Maximum: 31.24
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 12.18
5-th percentile: 16.549999
Q1: 20.2625
Median: 23.5550005
Q3: 27.3475
95-th percentile: 30.25499975
Maximum: 31.24
Range: 19.06
Interquartile range (IQR): 7.085

Descriptive statistics

Standard deviation: 4.319646858
Coefficient of variation (CV): 0.181756186
Kurtosis: -0.852538889
Mean: 23.76616144
Median Absolute Deviation (MAD): 3.5149995
Skewness: -0.2550146266
Sum: 16778.90998
Variance: 18.65934898
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
21.459999           5      0.7%
25.559999           4      0.6%
25.5                4      0.6%
28                  4      0.6%
25.92               4      0.6%
22.629999           3      0.4%
21.469999           3      0.4%
18.4                3      0.4%
19.77               3      0.4%
25.719999           3      0.4%
Other values (529)  670    94.9%

Minimum 5 values
Value  Count  Frequency (%)
12.18  1      0.1%
12.27  1      0.1%
13.07  1      0.1%
13.5   1      0.1%
13.54  1      0.1%

Maximum 5 values
Value      Count  Frequency (%)
31.24      1      0.1%
31.23      1      0.1%
31.219999  1      0.1%
31.07      1      0.1%
31.049999  1      0.1%

Low
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 547
Distinct (%): 77.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.05358354
Minimum: 10.85
Maximum: 30.5
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 10.85
5-th percentile: 15.53
Q1: 19.6525
Median: 22.959999
Q3: 26.7074995
95-th percentile: 29.57499975
Maximum: 30.5
Range: 19.65
Interquartile range (IQR): 7.0549995

Descriptive statistics

Standard deviation: 4.393272452
Coefficient of variation (CV): 0.1905678761
Kurtosis: -0.7102022948
Mean: 23.05358354
Median Absolute Deviation (MAD): 3.514998
Skewness: -0.3112842768
Sum: 16275.82998
Variance: 19.30084283
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
22.059999           6      0.8%
17.65               4      0.6%
22.52               4      0.6%
26.629999           3      0.4%
26.639999           3      0.4%
25.85               3      0.4%
27.360001           3      0.4%
25.809999           3      0.4%
26.1                3      0.4%
18.620001           3      0.4%
Other values (537)  671    95.0%

Minimum 5 values
Value  Count  Frequency (%)
10.85  1      0.1%
10.87  1      0.1%
11.08  1      0.1%
11.28  1      0.1%
11.83  1      0.1%

Maximum 5 values
Value      Count  Frequency (%)
30.5       1      0.1%
30.48      1      0.1%
30.469999  1      0.1%
30.450001  1      0.1%
30.42      1      0.1%

Close
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 548
Distinct (%): 77.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.3967847
Minimum: 11.29
Maximum: 30.969999
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 11.29
5-th percentile: 16.069999
Q1: 19.98
Median: 23.2449995
Q3: 27
95-th percentile: 29.915
Maximum: 30.969999
Range: 19.679999
Interquartile range (IQR): 7.02

Descriptive statistics

Standard deviation: 4.359761474
Coefficient of variation (CV): 0.1863401972
Kurtosis: -0.7840560115
Mean: 23.3967847
Median Absolute Deviation (MAD): 3.5150005
Skewness: -0.2766612282
Sum: 16518.13
Variance: 19.00752011
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
27.6                5      0.7%
26.24               3      0.4%
26.52               3      0.4%
21.67               3      0.4%
22.73               3      0.4%
27.34               3      0.4%
28.059999           3      0.4%
28.780001           3      0.4%
27.51               3      0.4%
28                  3      0.4%
Other values (538)  674    95.5%

Minimum 5 values
Value  Count  Frequency (%)
11.29  1      0.1%
11.5   1      0.1%
12     1      0.1%
12.21  1      0.1%
12.6   1      0.1%

Maximum 5 values
Value      Count  Frequency (%)
30.969999  1      0.1%
30.91      1      0.1%
30.9       1      0.1%
30.809999  1      0.1%
30.700001  1      0.1%

Adj Close
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 659
Distinct (%): 93.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.64901371
Minimum: 11.289165
Maximum: 30.80772
Zeros: 0
Zeros (%): 0.0%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 11.289165
5-th percentile: 15.446153
Q1: 19.24320125
Median: 22.8947725
Q3: 26.18594925
95-th percentile: 29.615361
Maximum: 30.80772
Range: 19.518555
Interquartile range (IQR): 6.942748

Descriptive statistics

Standard deviation: 4.422245294
Coefficient of variation (CV): 0.19525112
Kurtosis: -0.8498468137
Mean: 22.64901371
Median Absolute Deviation (MAD): 3.423383
Skewness: -0.1688080378
Sum: 15990.20368
Variance: 19.55625344
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
22.73               3      0.4%
21.469574           3      0.4%
27.100395           2      0.3%
27.041609           2      0.3%
17.941769           2      0.3%
20.129999           2      0.3%
27.097839           2      0.3%
14.036326           2      0.3%
17.100052           2      0.3%
25.983459           2      0.3%
Other values (649)  684    96.9%

Minimum 5 values
Value      Count  Frequency (%)
11.289165  1      0.1%
11.49915   1      0.1%
11.999112  1      0.1%
12.209097  1      0.1%
12.599069  1      0.1%

Maximum 5 values
Value      Count  Frequency (%)
30.80772   1      0.1%
30.697731  1      0.1%
30.687731  1      0.1%
30.547739  1      0.1%
30.544981  1      0.1%

Volume
Real number (ℝ≥0)

Distinct: 704
Distinct (%): 99.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64859370.96
Minimum: 0
Maximum: 254813800
Zeros: 1
Zeros (%): 0.1%
Memory size: 5.5 KiB

Quantile statistics

Minimum: 0
5-th percentile: 29218450
Q1: 43095225
Median: 56517400
Q3: 75712325
95-th percentile: 130171950
Maximum: 254813800
Range: 254813800
Interquartile range (IQR): 32617100

Descriptive statistics

Standard deviation: 33207988.44
Coefficient of variation (CV): 0.5119998536
Kurtosis: 5.500376542
Mean: 64859370.96
Median Absolute Deviation (MAD): 15810650
Skewness: 1.933092928
Sum: 4.57907159e+10
Variance: 1.102770496e+15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
37053900            2      0.3%
28577400            2      0.3%
57658400            1      0.1%
74622500            1      0.1%
53106300            1      0.1%
61494800            1      0.1%
33336000            1      0.1%
37699000            1      0.1%
86794800            1      0.1%
107092400           1      0.1%
Other values (694)  694    98.3%

Minimum 5 values
Value     Count  Frequency (%)
0         1      0.1%
19049900  1      0.1%
20809900  1      0.1%
21124400  1      0.1%
21970200  1      0.1%

Maximum 5 values
Value      Count  Frequency (%)
254813800  1      0.1%
240343800  1      0.1%
234951200  1      0.1%
227307600  1      0.1%
216954700  1      0.1%

Variation
Real number (ℝ)

ZEROS

Distinct: 420
Distinct (%): 59.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -0.0340510085
Minimum: -3.399999
Maximum: 1.51
Zeros: 10
Zeros (%): 1.4%
Memory size: 5.5 KiB

Quantile statistics

Minimum: -3.399999
5-th percentile: -0.77000025
Q1: -0.3200015
Median: -0.039999
Q3: 0.24
95-th percentile: 0.7874995
Maximum: 1.51
Range: 4.909999
Interquartile range (IQR): 0.5600015

Descriptive statistics

Standard deviation: 0.4916476821
Coefficient of variation (CV): -14.43856449
Kurtosis: 3.78152019
Mean: -0.0340510085
Median Absolute Deviation (MAD): 0.279999
Skewness: -0.514276296
Sum: -24.040012
Variance: 0.2417174433
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
0                   10     1.4%
-0.34               10     1.4%
-0.020001           8      1.1%
-0.039999           8      1.1%
-0.25               8      1.1%
0.17                7      1.0%
-0.09               7      1.0%
-0.210001           5      0.7%
0.18                5      0.7%
-0.130001           5      0.7%
Other values (410)  633    89.7%

Minimum 5 values
Value      Count  Frequency (%)
-3.399999  1      0.1%
-2.290001  1      0.1%
-1.76      1      0.1%
-1.459999  1      0.1%
-1.449999  1      0.1%

Maximum 5 values
Value     Count  Frequency (%)
1.51      1      0.1%
1.480002  1      0.1%
1.33      1      0.1%
1.299999  2      0.3%
1.190001  1      0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
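As a minimal sketch of the calculation described above (not the code the report itself runs), the function below divides the covariance by the product of the standard deviations. The name `pearson_r` is hypothetical, and the input values are the Open and Close columns of the first five sample rows of this report.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors in the covariance and in the standard deviations cancel,
    # so plain sums of (cross-)deviations are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Open and Close from the first five sample rows of this report.
open_prices = [16.190001, 16.490000, 16.780001, 16.700001, 16.740000]
close_prices = [16.549999, 16.700001, 16.730000, 16.830000, 17.030001]
r = pearson_r(open_prices, close_prices)

# Shifting and rescaling one variable leaves r unchanged (up to rounding).
r_rescaled = pearson_r(open_prices, [100.0 * c + 7.0 for c in close_prices])
print(r, r_rescaled)
```

Five rows are far too few to reproduce the correlation computed over all 706 observations; the snippet only illustrates the formula and the location/scale invariance.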

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
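The rank-based calculation can be sketched as follows, assuming tied values receive the average of the ranks they span (a common convention, though the text above does not specify one). The helper names `average_ranks` and `spearman_rho` are hypothetical.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    ss_x = sum((a - mean_x) ** 2 for a in rx)
    ss_y = sum((b - mean_y) ** 2 for b in ry)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("rho is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# A nonlinear but strictly increasing relationship: the ranks agree perfectly.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
rho = spearman_rho(xs, ys)
print(rho)
```

Because only the ordering matters, any strictly increasing transformation of either variable leaves ρ unchanged, which is why the cubic relationship above is treated as a perfect monotonic association.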

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the difference between the number of concordant pairs and the number of discordant pairs, divided by the total number of pairs.
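The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with the hypothetical name `kendall_tau_a`; statistical libraries often report a tie-adjusted variant (tau-b) instead, so results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1  # both variables move in the same direction
        elif product < 0:
            discordant += 1  # the variables move in opposite directions
        # A pair tied in x or y counts as neither concordant nor discordant.
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Four observations give six pairs; only the pair (2, 3) vs (3, 2) is discordant.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(tau)  # (5 - 1) / 6
```

This direct enumeration examines every pair and is therefore quadratic in the number of observations, which is acceptable for a dataset of this size (706 rows).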

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for Phik.

Missing values

Sample

First rows

     df_index  Date        Open       High       Low        Close      Adj Close  Volume       Variation
0    0         2018-01-02  16.190001  16.549999  16.190001  16.549999  15.353477  33461800.0   0.359998
1    1         2018-01-03  16.490000  16.719999  16.370001  16.700001  15.492632  55940900.0   0.210001
2    2         2018-01-04  16.780001  16.959999  16.620001  16.730000  15.520465  37064900.0   -0.050001
3    3         2018-01-05  16.700001  16.860001  16.570000  16.830000  15.613236  26958200.0   0.129999
4    4         2018-01-08  16.740000  17.030001  16.709999  17.030001  15.798776  28400000.0   0.290001
5    5         2018-01-09  17.030001  17.160000  16.959999  17.030001  15.798776  35070900.0   0.000000
6    6         2018-01-10  16.920000  17.049999  16.770000  16.799999  15.585404  28547700.0   -0.120001
7    7         2018-01-11  16.879999  17.299999  16.840000  17.250000  16.002871  37921500.0   0.370001
8    8         2018-01-12  17.040001  17.410000  17.020000  17.299999  16.049253  45912100.0   0.259998
9    9         2018-01-15  17.320000  17.440001  17.150000  17.350000  16.095642  28945400.0   0.030000

Last rows

     df_index  Date        Open       High       Low        Close      Adj Close  Volume       Variation
696  699       2020-10-27  20.260000  20.370001  19.809999  19.879999  19.879999  48978400.0   -0.380001
697  700       2020-10-28  19.350000  19.440001  18.670000  18.670000  18.670000  79247400.0   -0.680000
698  701       2020-10-29  18.430000  19.370001  17.740000  19.290001  19.290001  86794800.0   0.860001
699  702       2020-10-30  19.139999  19.540001  18.870001  18.940001  18.940001  64185100.0   -0.199998
700  703       2020-11-03  19.510000  19.900000  19.330000  19.650000  19.650000  66292400.0   0.140000
701  704       2020-11-04  20.010000  20.070000  19.270000  19.719999  19.719999  65777600.0   -0.290001
702  705       2020-11-05  19.950001  19.990000  19.719999  19.889999  19.889999  40874800.0   -0.060002
703  706       2020-11-06  19.549999  19.950001  19.540001  19.719999  19.719999  25635300.0   0.170000
704  707       2020-11-09  21.110001  22.690001  21.040001  21.610001  21.610001  164904100.0  0.500000
705  708       2020-11-10  21.889999  23.150000  21.809999  23.080000  23.080000  163732600.0  1.190001